Initialization

To use the API provided by Global Fishing Watch (GFW), we must identify ourselves with a “key” (an access token) whenever we request information. We save this key in a variable for easy access later.

library(gfwr)
key <- gfw_auth()    #Access the token and save as "key"
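
Before authenticating, it can help to confirm that a token is actually available. The sketch below assumes that gfw_auth() reads the token from an environment variable named GFW_TOKEN (typically set in .Renviron). That variable name is an assumption not stated in this tutorial, so check the gfwr documentation if it does not match your setup.

```r
# Assumption: the token is stored in the GFW_TOKEN environment variable
token_is_set <- nzchar(Sys.getenv("GFW_TOKEN"))

if (!token_is_set) {
  message("No token found in GFW_TOKEN; requests to the API will likely fail.")
}
```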

Searching for vessel information using identifiers and dataset categories

Every vessel has identifiers such as its MMSI, ship name, or GFW vessel id. We can use any of these identifiers to find a specific vessel and retrieve information about it. The function get_vessel_info makes the request to the API. It takes several arguments, which work much like the search options of a search engine:

1: query is where we input the identifier.
2: search_type is one of basic, advanced, or id. A “basic” search takes features such as MMSI, IMO, callsign, or shipname as input and identifies all vessels in the specified dataset that match. An “advanced” search allows fuzzy matching with terms such as LIKE. An “id” search looks up vessels by their vessel id (generated by GFW).
3: dataset specifies which identity dataset to use: carrier_vessel, support_vessel, fishing_vessel, or all. With all, every dataset is searched. This is generally recommended and is the default.
4: key indicates who we are when making the request. (In key = key, the first “key” is the argument name and the second is the variable in which we stored our key earlier.)

Example: To get information on a vessel with MMSI = 224224000 using all datasets:

get_vessel_info(query = 224224000, search_type = "basic", dataset = "all", key = key)
## # A tibble: 1 × 17
##    name callsign firstTransmissionDate flag  geartype id                   imo  
##   <int> <chr>    <chr>                 <chr> <lgl>    <chr>                <chr>
## 1     1 EBSJ     2015-10-13T15:47:16Z  ESP   NA       3c99c326d-dd2e-175d… 8733…
## # ℹ 10 more variables: lastTransmissionDate <chr>, mmsi <chr>, msgCount <int>,
## #   posCount <int>, shipname <chr>, source <chr>, vesselType <chr>,
## #   years <list>, dataset <chr>, score <dbl>

To combine different fields and do fuzzy matching to search the carrier vessel dataset:

get_vessel_info(query = "shipname LIKE '%GABU REEFE%' OR imo = '8300949'", 
                search_type = "advanced", dataset = "carrier_vessel", key = key)
## # A tibble: 3 × 17
##    name callsign firstTransmissionDate flag  geartype id                   imo  
##   <int> <chr>    <chr>                 <chr> <lgl>    <chr>                <chr>
## 1     1 ER2732   2019-02-22T21:46:13Z  MDA   NA       0b7047cb5-58c8-6e63… 8300…
## 2     2 D6FJ2    2012-01-02T16:50:42Z  COM   NA       58cf536b1-1fca-dac3… 8300…
## 3     3 TJMC996  2022-01-24T09:13:48Z  CMR   NA       1da8dbc23-3c48-d5ce… 8300…
## # ℹ 10 more variables: lastTransmissionDate <chr>, mmsi <chr>, msgCount <int>,
## #   posCount <int>, shipname <chr>, source <chr>, vesselType <chr>,
## #   years <list>, dataset <chr>, score <dbl>

To specify one vessel id or several (note that multiple ids must be given as a single comma-separated string, on one line):

get_vessel_info(query = "8c7304226-6c71-edbe-0b63-c246734b3c01,6583c51e3-3626-5638-866a-f47c3bc7ef7c,71e7da672-2451-17da-b239-857831602eca", 
                search_type = 'id', key = key)
## # A tibble: 3 × 16
##    name callsign firstTransmissionDate flag  geartype          id          imo  
##   <int> <chr>    <chr>                 <chr> <chr>             <chr>       <chr>
## 1     1 5BWC3    2013-05-15T20:18:31Z  CYP   <NA>              8c7304226-… 9076…
## 2     2 DTBY3    2013-09-02T03:59:51Z  KOR   tuna_purse_seines 6583c51e3-… 8919…
## 3     3 DUQA-7   2017-02-15T05:54:53Z  PHL   tuna_purse_seines 71e7da672-… 8118…
## # ℹ 9 more variables: lastTransmissionDate <chr>, mmsi <chr>, msgCount <int>,
## #   posCount <int>, shipname <chr>, source <chr>, vesselType <chr>,
## #   years <list>, dataset <chr>
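
When the ids are held in a character vector rather than typed out by hand, the comma-separated string can be built with base R. This is a minimal sketch; the variable names ids and id_query are illustrative only.

```r
# Vessel ids from the example above, held as a character vector
ids <- c(
  "8c7304226-6c71-edbe-0b63-c246734b3c01",
  "6583c51e3-3626-5638-866a-f47c3bc7ef7c",
  "71e7da672-2451-17da-b239-857831602eca"
)

# Collapse into the single comma-separated string that query expects
id_query <- paste(ids, collapse = ",")
id_query
```

The resulting id_query can then be passed as the query argument together with search_type = 'id'.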

Finding “events” (fishing, transshipment, port visits) for a specific vessel

The API returns information not only on vessels but also on events. Supported event types include apparent fishing events, potential transshipment events (two-vessel encounters and loitering by refrigerated carrier vessels), and port visits. See https://globalfishingwatch.org/our-apis/documentation#data-caveat for details on how these events are estimated. The function for requesting events is get_event. Its arguments are event_type, vessel, confidences, and key:

1: event_type specifies the type of event to search for.
2: vessel specifies which vessel to focus on; we pass the vessel id here.
3: confidences applies only to port visit events. Low confidence (level = 2) indicates that only a port stop or gap was detected using AIS within the port. Medium confidence (level = 3) indicates that a port entry or exit was detected using AIS, along with a stop or gap within the port. High confidence (level = 4) indicates that the vessel was identified using AIS with an entry, a stop or gap, and an exit within the port. A lower-confidence port visit may sometimes be a false port visit caused by noisy AIS transmission and requires further inspection of the vessel tracks.
4: key is our access token, as before.
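
To keep the meaning of the confidence levels close to the code, they can be stored in a small lookup. This is only a sketch: confidence_levels and confidence_label are hypothetical names and are not part of gfwr.

```r
# Port-visit confidence levels as described above
confidence_levels <- c("2" = "low", "3" = "medium", "4" = "high")

# Hypothetical helper (not part of gfwr): label a confidence value
confidence_label <- function(level) {
  unname(confidence_levels[as.character(level)])
}

confidence_label(4)
confidence_label("2")
```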

Example:

Suppose we only have the MMSI of a vessel and would like to find its events. First, we get the vessel id from the MMSI. The $id at the end selects the “id” column from the information returned by get_vessel_info.

vessel_id <- get_vessel_info(query = 224224000, search_type = "basic", key = key)$id

To get a list of port visits for that vessel with get_event, we specify the event type, the vessel id, the confidences, and our key. A list of events is returned, including the date, duration, location, and so on. Each event also has an event id (which is different from the vessel id).

get_event(event_type='port_visit',
          vessel = vessel_id,
          confidences = '4',
          key = key
)
## [1] "Downloading 35 events from GFW"
## # A tibble: 35 × 11
##    id    type  start               end                   lat    lon regions     
##    <chr> <chr> <dttm>              <dttm>              <dbl>  <dbl> <list>      
##  1 b725… port… 2015-11-04 05:22:13 2015-11-07 10:46:28  5.23  -4.00 <named list>
##  2 f03f… port… 2015-12-06 11:48:38 2015-12-10 16:19:37  5.24  -4.08 <named list>
##  3 cbd7… port… 2016-01-09 06:47:57 2016-01-13 14:30:33  5.24  -4.00 <named list>
##  4 6265… port… 2016-02-25 14:26:38 2016-03-01 13:21:21  5.25  -4.00 <named list>
##  5 4a7f… port… 2016-03-03 05:47:02 2016-03-03 11:46:33  5.20  -4.02 <named list>
##  6 617d… port… 2016-03-31 04:43:41 2016-04-02 09:07:10  5.23  -4.00 <named list>
##  7 3c26… port… 2016-04-20 06:50:58 2016-04-20 19:47:10 14.7  -17.4  <named list>
##  8 104e… port… 2016-04-24 07:14:33 2016-04-24 11:54:59 14.7  -17.4  <named list>
##  9 8f19… port… 2016-05-18 19:31:04 2016-05-22 14:20:05  5.20  -4.01 <named list>
## 10 bf64… port… 2016-06-26 15:08:16 2016-06-30 10:39:03  5.20  -4.07 <named list>
## # ℹ 25 more rows
## # ℹ 4 more variables: boundingBox <list>, distances <list>, vessel <list>,
## #   event_info <list>

We can also pass more than one vessel id as a comma-separated string, and we can specify a start date and end date to restrict the events to a particular period.

get_event(event_type='port_visit',
          vessel = '8c7304226-6c71-edbe-0b63-c246734b3c01,6583c51e3-3626-5638-866a-f47c3bc7ef7c',
          confidences = 4,
          start_date = "2020-01-01",
          end_date = "2020-02-01",
          key = key
)
## [1] "Downloading 3 events from GFW"
## # A tibble: 3 × 11
##   id      type  start               end                   lat   lon regions     
##   <chr>   <chr> <dttm>              <dttm>              <dbl> <dbl> <list>      
## 1 7cd1e3… port… 2019-12-19 23:05:31 2020-01-24 19:05:18  28.1 -15.4 <named list>
## 2 c2f096… port… 2020-01-26 05:52:47 2020-01-29 14:39:33  20.8 -17.0 <named list>
## 3 7c06e4… port… 2020-01-31 02:20:08 2020-02-03 15:56:31  28.1 -15.4 <named list>
## # ℹ 4 more variables: boundingBox <list>, distances <list>, vessel <list>,
## #   event_info <list>

Let’s try another event type

get_event(event_type='encounter',
          start_date = "2020-01-01",
          end_date = "2020-02-01",
          key = key
)
## [1] "Downloading 1516 events from GFW"
## # A tibble: 1,516 × 11
##    id                type  start               end                    lat    lon
##    <chr>             <chr> <dttm>              <dttm>               <dbl>  <dbl>
##  1 a3cff76a070a919f… enco… 2019-12-31 08:40:00 2020-01-01 07:40:00  57.5   157. 
##  2 a3cff76a070a919f… enco… 2019-12-31 08:40:00 2020-01-01 07:40:00  57.5   157. 
##  3 b059d20534c7fd5f… enco… 2019-12-31 12:00:00 2020-01-01 13:50:00 -17.6   -79.3
##  4 b059d20534c7fd5f… enco… 2019-12-31 12:00:00 2020-01-01 13:50:00 -17.6   -79.3
##  5 cd07d7e5d65e81b3… enco… 2019-12-31 12:50:00 2020-01-01 09:50:00 -17.7   -79.2
##  6 cd07d7e5d65e81b3… enco… 2019-12-31 12:50:00 2020-01-01 09:50:00 -17.7   -79.2
##  7 13dac0526c993292… enco… 2019-12-31 14:50:00 2020-01-01 20:20:00 -17.6   -79.4
##  8 13dac0526c993292… enco… 2019-12-31 14:50:00 2020-01-01 20:20:00 -17.6   -79.4
##  9 2e8b8040d87ad0ae… enco… 2019-12-31 16:00:00 2020-01-01 08:50:00  -3.44 -147. 
## 10 2e8b8040d87ad0ae… enco… 2019-12-31 16:00:00 2020-01-01 08:50:00  -3.44 -147. 
## # ℹ 1,506 more rows
## # ℹ 5 more variables: regions <list>, boundingBox <list>, distances <list>,
## #   vessel <list>, event_info <list>

Another example: let’s combine the Vessels and Events APIs to get fishing events for a list of 100 USA-flagged trawlers. We first search for all vessels that are flagged to the USA and have geartype “trawlers” using the function get_vessel_info. Then we save the ids of the first 100 vessels that match.

# Download the list of USA trawlers
usa_trawlers <- get_vessel_info(
  query = "flag = 'USA' AND geartype = 'trawlers'", 
  search_type = "advanced", 
  dataset = "fishing_vessel",
  key = key
)

# Collapse the vessel ids into a comma-separated string, the format the Events API expects
# head() keeps at most 100 ids and avoids NA entries if fewer vessels are returned
usa_trawler_ids <- paste0(head(usa_trawlers$id, 100), collapse = ',')

Now get the list of fishing events for these USA trawlers in January 2020 (using the id list built above):

get_event(event_type='fishing',
          vessel = usa_trawler_ids,
          start_date = "2020-01-01",
          end_date = "2020-02-01",
          key = key
)
## [1] "Downloading 44 events from GFW"
## # A tibble: 44 × 11
##    id    type  start               end                   lat    lon regions     
##    <chr> <chr> <dttm>              <dttm>              <dbl>  <dbl> <list>      
##  1 c0c3… fish… 2020-01-08 02:35:49 2020-01-08 05:57:16  30.2  -88.2 <named list>
##  2 8834… fish… 2020-01-08 06:20:18 2020-01-08 13:41:45  30.2  -88.2 <named list>
##  3 f043… fish… 2020-01-11 03:12:54 2020-01-11 08:01:59  24.8  -82.5 <named list>
##  4 0ef9… fish… 2020-01-14 02:54:19 2020-01-14 13:53:57  29.2  -90.1 <named list>
##  5 7b08… fish… 2020-01-14 07:22:19 2020-01-14 08:43:08  41.1  -69.3 <named list>
##  6 48ec… fish… 2020-01-14 17:38:18 2020-01-14 20:18:15  41.1  -69.3 <named list>
##  7 411a… fish… 2020-01-15 05:38:53 2020-01-15 08:46:53  41.7  -69.7 <named list>
##  8 fc33… fish… 2020-01-15 11:08:05 2020-01-15 15:38:34  41.7  -69.6 <named list>
##  9 9547… fish… 2020-01-20 15:33:55 2020-01-21 02:56:44  55.3 -164.  <named list>
## 10 2acd… fish… 2020-01-20 23:38:15 2020-01-21 02:03:40  54.2 -166.  <named list>
## # ℹ 34 more rows
## # ℹ 4 more variables: boundingBox <list>, distances <list>, vessel <list>,
## #   event_info <list>

When no events are available, get_event() prints a message and returns NULL.

get_event(event_type='fishing',
          vessel = usa_trawler_ids,
          start_date = "2020-01-01",
          end_date = "2020-01-01",
          key = key
)
## [1] "Your request returned zero results"
## NULL
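
Because the result can be NULL, downstream code that expects a data frame may need a guard. The sketch below uses a hypothetical helper, events_or_empty, which is not part of gfwr, and stand-in objects instead of real API responses.

```r
# Hypothetical helper (not part of gfwr): replace a NULL result with an empty data frame
events_or_empty <- function(events) {
  if (is.null(events)) data.frame() else events
}

# Stand-ins for what get_event() might return
no_events   <- NULL
some_events <- data.frame(id = c("a", "b"), type = "fishing")

nrow(events_or_empty(no_events))    # 0 rows
nrow(events_or_empty(some_events))  # 2 rows
```

With such a guard, calls like nrow() or later plotting code fail less abruptly when a query matches nothing.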

Map API

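The get_raster function requests a raster and converts the response to a data frame. To use it, you should specify the following arguments, all of which appear in the examples below: spatial_resolution ('low' here), temporal_resolution ('yearly' here), group_by ('flag' or 'flagAndGearType' here), date_range (start and end dates in one comma-separated string), region (a GeoJSON string or a region id), region_source ('user_json', 'eez', or 'mpa' here), and key.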

# This is a manually input polygon region

region_json = '{"geojson":{"type":"Polygon","coordinates":[[[-76.11328125,-26.273714024406416],[-76.201171875,-26.980828590472093],[-76.376953125,-27.527758206861883],[-76.81640625,-28.30438068296276],[-77.255859375,-28.767659105691244],[-77.87109375,-29.152161283318918],[-78.486328125,-29.45873118535532],[-79.189453125,-29.61167011519739],[-79.892578125,-29.6880527498568],[-80.595703125,-29.61167011519739],[-81.5625,-29.382175075145277],[-82.177734375,-29.07537517955835],[-82.705078125,-28.6905876542507],[-83.232421875,-28.071980301779845],[-83.49609375,-27.683528083787756],[-83.759765625,-26.980828590472093],[-83.84765625,-26.35249785815401],[-83.759765625,-25.64152637306576],[-83.583984375,-25.16517336866393],[-83.232421875,-24.447149589730827],[-82.705078125,-23.966175871265037],[-82.177734375,-23.483400654325635],[-81.5625,-23.241346102386117],[-80.859375,-22.998851594142906],[-80.15625,-22.917922936146027],[-79.453125,-22.998851594142906],[-78.662109375,-23.1605633090483],[-78.134765625,-23.40276490540795],[-77.431640625,-23.885837699861995],[-76.9921875,-24.28702686537642],[-76.552734375,-24.846565348219727],[-76.2890625,-25.48295117535531],[-76.11328125,-26.273714024406416]]]}}'

get_raster(
  spatial_resolution = 'low',
  temporal_resolution = 'yearly',
  group_by = 'flag',
  date_range = '2021-01-01,2021-12-31',
  region = region_json,
  region_source = 'user_json',
  key = key
  )
## # A tibble: 5 × 6
##     Lat   Lon `Time Range` flag  `Vessel IDs` `Apparent Fishing hours`
##   <dbl> <dbl>        <dbl> <chr>        <dbl>                    <dbl>
## 1 -24.7 -78.6         2021 ESP              2                   0.959 
## 2 -24.6 -78.4         2021 ESP              2                   0.276 
## 3 -24.2 -77.8         2021 ESP              1                   0.418 
## 4 -24.7 -78.5         2021 ESP              1                   0.0344
## 5 -27.3 -82           2021 ESP              1                   0.428
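
The date_range argument takes both dates in one comma-separated string. If the dates are computed rather than typed, a small helper can build that string. This is a sketch only: make_date_range is a hypothetical name, not a gfwr function.

```r
# Hypothetical helper (not part of gfwr): build the "start,end" string
make_date_range <- function(start, end) {
  start <- as.Date(start)
  end <- as.Date(end)
  stopifnot(start <= end)  # fail early on reversed dates
  paste(format(start, "%Y-%m-%d"), format(end, "%Y-%m-%d"), sep = ",")
}

make_date_range("2021-01-01", "2021-12-31")
```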

If you want raster data for a particular Exclusive Economic Zone (EEZ), you can use the get_region_id function to get the EEZ id and enter that code in the region argument of get_raster instead of the GeoJSON data (making sure to set region_source to ‘eez’):

Example: Use the get_region_id function to get the EEZ code of Cote d’Ivoire, and then get the raster for this EEZ.

code_eez <- get_region_id(region_name = 'CIV', region_source = 'eez', key = key)

get_raster(spatial_resolution = 'low',
           temporal_resolution = 'yearly',
           group_by = 'flag',
           date_range = '2021-01-01,2021-10-01',
           region = code_eez$id,
           region_source = 'eez',
           key = key)
## # A tibble: 572 × 6
##      Lat   Lon `Time Range` flag  `Vessel IDs` `Apparent Fishing hours`
##    <dbl> <dbl>        <dbl> <chr>        <dbl>                    <dbl>
##  1   5.2  -4           2021 CHN              5                   501.  
##  2   5    -5.4         2021 CHN              3                    23.8 
##  3   4.5  -4.3         2021 GHA              1                     3.32
##  4   4    -4           2021 FRA              6                    24.8 
##  5   4    -3.9         2021 GTM              2                     6.99
##  6   3.9  -3.8         2021 GTM              2                     3.56
##  7   4.7  -6.3         2021 CHN              1                    32.0 
##  8   4.5  -4           2021 BLZ              2                     3.55
##  9   5.3  -4           2021 BES              4                    67.0 
## 10   4.6  -4           2021 FRA              2                     3.22
## # ℹ 562 more rows

You can also search for just one word of the EEZ name, choose among the matches, and use the returned id to retrieve the raster for that region. Example: search for “French” and request the raster for French Polynesia.

(get_region_id(region_name = 'French', region_source = 'eez', key = key))
## # A tibble: 2 × 3
##      id iso3  label           
##   <dbl> <chr> <chr>           
## 1  8462 FRA   French Guiana   
## 2  8440 FRA   French Polynesia
get_raster(spatial_resolution = 'low',
           temporal_resolution = 'yearly',
           group_by = 'flag',
           date_range = '2021-01-01,2021-10-01',
           region = 8440,
           region_source = 'eez',
           key = key)
## # A tibble: 6,856 × 6
##      Lat   Lon `Time Range` flag  `Vessel IDs` `Apparent Fishing hours`
##    <dbl> <dbl>        <dbl> <chr>        <dbl>                    <dbl>
##  1 -13   -143          2021 PYF              3                   20.9  
##  2 -14.8 -144.         2021 PYF              1                    0.533
##  3 -15.1 -144.         2021 PYF              1                    0.766
##  4 -15   -144          2021 PYF              2                    6.66 
##  5 -13.7 -144.         2021 PYF              2                    9.26 
##  6 -13   -143.         2021 PYF              4                   14.7  
##  7 -13.5 -143.         2021 PYF              2                    1.73 
##  8 -13.5 -143          2021 PYF              3                    7.31 
##  9 -13   -143.         2021 PYF              2                    3.32 
## 10 -13.9 -142.         2021 PYF              2                    3.33 
## # ℹ 6,846 more rows
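
Rather than typing the id by hand, it can be selected from the lookup result by its label. The sketch below uses a stand-in data frame copied from the get_region_id output shown above, so it runs without an API call; french_regions and polynesia_id are illustrative names.

```r
# Stand-in for the result of get_region_id(region_name = 'French', ...),
# copied from the output shown above
french_regions <- data.frame(
  id    = c(8462, 8440),
  iso3  = c("FRA", "FRA"),
  label = c("French Guiana", "French Polynesia")
)

# Select the id by label instead of hard-coding it
polynesia_id <- french_regions$id[french_regions$label == "French Polynesia"]
polynesia_id
```

The selected value can then be passed to the region argument of get_raster.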

A similar approach can be used to search for a specific Marine Protected Area (MPA), in this case the Phoenix Island Protected Area (PIPA). The get_region_id function works for marine protected areas as well.

code_mpa <- get_region_id(region_name = 'Phoenix', region_source = 'mpa', key = key)

get_raster(spatial_resolution = 'low',
           temporal_resolution = 'yearly',
           group_by = 'flag',
           date_range = '2015-01-01,2015-06-01',
           region = code_mpa$id[1],
           region_source = 'mpa',
           key = key)
## # A tibble: 40 × 6
##      Lat   Lon `Time Range` flag  `Vessel IDs` `Apparent Fishing hours`
##    <dbl> <dbl>        <dbl> <chr>        <dbl>                    <dbl>
##  1  -2.8 -176.         2015 KOR              1                    9.29 
##  2  -3.5 -176.         2015 KOR              1                    3.11 
##  3  -3.4 -176.         2015 KOR              1                    1.37 
##  4  -3.5 -176.         2015 KOR              1                   10.7  
##  5  -1   -170.         2015 KOR              1                    2.62 
##  6  -2.2 -176.         2015 KIR              1                    0.532
##  7  -3.9 -176.         2015 KOR              1                    4.88 
##  8  -4.1 -176.         2015 KOR              1                    1.57 
##  9  -4   -176.         2015 KOR              1                    1.37 
## 10  -2.9 -176.         2015 FSM              1                    2.77 
## # ℹ 30 more rows

Several examples of how this information can be visualized

library(maps)
library(ggmap)
library(tidyverse)
library(plotly)

Example 1

Get the fishing history of a vessel named Victory by passing its vessel id. We can plot where it fished using ggplot2 and plotly (an interactive data visualization tool). Calling ggplotly turns a ggplot2 plot into an interactive plotly plot.

#Save the fishing history into victory_fishing_track
victory_fishing_track <- get_event(event_type='fishing',
          vessel = "1428722dc-cb5c-7377-44ab-f0777c8b71fb",
          start_date = "2021-01-01",
          end_date = "2021-12-31",
          key = key
)
## [1] "Downloading 120 events from GFW"
#Draw a map with this area
area <- c(left = -100,
          right = -88,
          bottom = 25,
          top = 32)
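# Note: recent ggmap releases serve these tiles through get_stadiamap(), which needs a Stadia Maps API key
v_map <- get_stamenmap(bbox = area, zoom = 7)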

#Plot the coordinates on the map
v_final_map <- ggmap(v_map) + 
    geom_point(data = victory_fishing_track, aes(x = lon, y = lat)) + # Add coordinate data
    theme_bw() + # Change the plot theme
    ggtitle("Track of Victory fishing location in 2021") + # Give the plot a title
    xlab("Longitude") + # Change x axis title
    ylab("Latitude") # Change y axis title
ggplotly(v_final_map) # Convert the saved ggplot into an interactive plot

Example 2

#Save the port visit history into victory_port_track
victory_port_track <- get_event(event_type='port_visit',
          vessel = "1428722dc-cb5c-7377-44ab-f0777c8b71fb",
          confidences = 4,
          start_date = "2021-01-01",
          end_date = "2021-12-31",
          key = key
)
## [1] "Downloading 23 events from GFW"
#Draw a map with this area
area <- c(left = -100,
          right = -88,
          bottom = 25,
          top = 32)
v_map <- get_stamenmap(bbox = area, zoom = 7)

#Plot the coordinates on the map
v_final_map <- ggmap(v_map) + 
    geom_point(data = victory_port_track, aes(x = lon, y = lat)) + # Add coordinate data
    theme_bw() + # Change the plot theme
    ggtitle("Victory port visit locations in 2021") + # Give the plot a title
    xlab("Longitude") + # Change x axis title
    ylab("Latitude") # Change y axis title
ggplotly(v_final_map) # Convert the saved ggplot into an interactive plot

Example 3

Get the raster of vessel activity within an EEZ and plot it to see where vessels appeared in this region.

# Get the eez code for CIV
code_eez <- get_region_id(region_name = 'CIV', region_source = 'eez', key = key)

# Get the raster for CIV
civ_raster <- get_raster(spatial_resolution = 'low',
           temporal_resolution = 'yearly',
           group_by = 'flagAndGearType',
           date_range = '2021-01-01,2021-10-01',
           region = code_eez$id,
           region_source = 'eez',
           key = key)

# Draw a map with this area
area <- c(left = -8,
          right = -2,
          bottom = 0,
          top = 6)
r_map <- get_stamenmap(bbox = area, zoom = 7)

# Plot the appearance within this area on the map
raster_final_map <- ggmap(r_map) + 
    geom_point(data = civ_raster, aes(x = Lon, y = Lat)) + # Add coordinate data
    theme_bw() + # Change the plot theme
    ggtitle("Vessel activity in the CIV EEZ in 2021") + # Give the plot a title
    xlab("Longitude") + # Change x axis title
    ylab("Latitude") # Change y axis title
ggplotly(raster_final_map) # Convert the saved ggplot into an interactive plot

Example 4

For the “CIV” region, we would like to see which flag states the vessels belong to. Using civ_raster, the raster for CIV obtained previously, we group the data by the Flag column and then plot the flag distribution with plotly. Note that count here is the number of raster rows per flag, not the number of distinct vessels.

# Summarize the data to get the flag distribution using previous raster for CIV
flag_summary <- civ_raster %>%
  group_by(Flag) %>%
  summarise(count = n()) %>%
  mutate(percentage = count / sum(count) * 100)

# Create the pie chart
plot_ly(flag_summary, 
             labels = ~Flag, 
             values = ~count, 
             type = 'pie', 
             textposition = 'inside',
             textinfo = 'label+percent',
             insidetextorientation = 'radial')

Example 5

Similar to the previous example, we can see the gear type distribution for vessels in CIV.

# Summarize the data to get the Gear distribution
gear_summary <- civ_raster %>%
  group_by(Geartype) %>%
  summarise(count = n())

plot_ly(gear_summary, 
             labels = ~Geartype, 
             values = ~count, 
             type = 'pie', 
             textposition = 'inside',
             textinfo = 'label+percent',
             insidetextorientation = 'radial')